Professional Cloud DevOps Engineer


> Pay attention for 5 minutes before we dive in.


> Advanced certification


> Expectations
> Basics of Compute Engine,
> Kubernetes, Docker
> Learn by Doing


> 20/80
GCP certifications


https://cloud.google.com/certification/cloud-devops-engineer 
Cloud Cost for this course


> $0 - for GCP account


> GCP Free trial


> $300 credit for the next 3 months https://cloud.google.com/free


> Length: Two hours


> Registration fee: $200 (plus tax where applicable)
> Languages: English


> Exam format: Multiple choice and multiple select
Google Cloud DevOps
Google DevOps


> Apply site reliability engineering principles to a service 
> Build and implement CI/CD pipelines for a service 


> Implement service monitoring strategies 
> Optimize service performance 


> Manage service incidents 
GCP Basics


> Google Cloud Overview 

> Create GCP Account 

> GCP Console Walkthrough 
> GCP Regions & Zones 

> Creating GCP Projects 

> Google Cloud Shell 
SRE - Site Reliability Engineering
SRE


> History of Software Development Cycle 
> DevOps & SRE

> Role of SRE 

> Eliminating Toil 

> Blameless Postmortem 

> SLI, SLO & SLA 


> Error Budgets
History


DEVELOPERS
> Developers write code
> Update software
> Add new features
> Don't worry about stability
> They want to push code faster to Prod


OPERATORS
> Operators know how to deploy & monitor applications
> Operators don't know how to write code
> They know how to assemble code
> Solve production issues
> Know how to scale applications
> They prefer fewer updates
DevOps


> DevOps is a set of practices, guidelines and culture
> designed to reduce the gap between software development and software operations.


> If both teams work together, productivity will increase


> DevOps establishes five goals:
> Reduce organizational silos 
> Accept failure as normal 
> Implement gradual changes 
> Leverage tooling and automation 
> Measure everything 
SRE


> There is a problem with DevOps
> The goal of DevOps is broad.
> DevOps does not define how to implement it.


> This is where SRE comes in.


> DevOps is a philosophy, whereas SRE is an implementation of the DevOps philosophy.


> class SRE implements DevOps
> SRE Practices
> SRE Role
> blameless postmortems
> error budget
> reduce toil
> track service level metrics: SLIs, SLOs, and SLAs.
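
The "class SRE implements DevOps" line can be read almost literally. The sketch below is only an illustration of that analogy: the method names are made up here from the five DevOps goals, and the pairing of each goal with a practice follows how these slides introduce them, not any official mapping.

```python
from abc import ABC, abstractmethod


class DevOps(ABC):
    """The five DevOps goals as an interface: what to achieve, not how."""

    @abstractmethod
    def reduce_organizational_silos(self) -> str: ...

    @abstractmethod
    def accept_failure_as_normal(self) -> str: ...

    @abstractmethod
    def implement_gradual_changes(self) -> str: ...

    @abstractmethod
    def leverage_tooling_and_automation(self) -> str: ...

    @abstractmethod
    def measure_everything(self) -> str: ...


class SRE(DevOps):
    """One concrete way to meet each goal, using the SRE practices above."""

    def reduce_organizational_silos(self) -> str:
        return "developers and SREs share responsibility for production"

    def accept_failure_as_normal(self) -> str:
        return "blameless postmortems"

    def implement_gradual_changes(self) -> str:
        return "error budgets"

    def leverage_tooling_and_automation(self) -> str:
        return "reduce toil"

    def measure_everything(self) -> str:
        return "SLIs, SLOs, and SLAs"
```

`DevOps()` cannot be instantiated because it only declares goals, while `SRE()` can, which is the point of the analogy.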


© ANKIT MISTRY — GOOGLE CLOUD 


SRE Role 


> Specific job role

> Old operator role -> SRE role

> A Site Reliability Engineer is basically the result of asking a software engineer to design an operations team
> SRE requires experience in both development and operations


> SREs spend half of their time doing ops-related work
> production issues, attending calls, performing manual interventions


> SREs spend the other half of their time on development tasks: scaling systems, automation
> Compared to the old operator model, SREs and developers share responsibility for the Prod server
> SREs build the tools that developers use to compile, test, and deploy their code (CI/CD pipeline)


> Developers and SREs work together to fix issues
Blameless postmortems


> One of the goals of DevOps is to accept failure as normal

> Failure is unavoidable, however good a system you design.

> Whenever you change a system, risk is involved

> If your rate of change is zero, risk is also zero. But that means you are stopping growth.
> You need to balance change and risk.

> You can take failure as an opportunity to grow the business: if things break, fix them.

> A fix will teach you a lot and minimize future issues.


> In SRE, you accomplish this with blameless postmortems
Blameless postmortems (Cntd..)


> The idea behind blameless postmortems
> is to analyze the system failure
> Find the root cause behind it.
> Discuss exactly what happened
> Decide what actions need to be performed.


> Not to look for someone to blame.
> The assumption is that everyone had good intentions


> Some postmortem questions that need to be asked:
> When did the incident begin & end?
> How was the incident reported?
> Who was involved?
> Which systems were affected?
> What is the root cause of the failure?
> How can it be avoided in the future?
Blameless postmortems (Cntd...)


> Accept that human error is involved.


> Blameless postmortems are
> honest communication with other team members so that similar incidents can be avoided in the future
Toil


> One of the goals of DevOps is to leverage tooling and automation

> Many tasks are manual and laborious.

> Tasks like password changes, copying files, creating new folders, restarting servers
> These types of tasks are considered toil.

> Identifying toil is important.

> Not all tasks are toil.

> Some tasks are laborious but not necessarily toil.


> Toil is related to
> Prod systems
> Manual, repetitive & automatable tasks
> SREs want to reduce toil through automation.


> Tasks like
> Automate the CI/CD pipeline
> Schedule jobs
> Write automation scripts
> Automate testing
> No manual hardware provisioning
> If a repetitive task can be automated, it should be automated


> Thanks to automation, more people can work on something more interesting


> SREs should spend a significant amount of time reducing toil.


Error budget 


> One of the goals of DevOps is to implement gradual change


> Why outages occur
> New features, changes, new hardware, security patches


> More change leads to a less stable system

> How to balance change & stability

> We have to define a metric for system reliability.

> It is a business problem

> How much can the service fail before it begins to have a significant negative impact?
> How quickly do we need to be able to release new features?


> Depending on the target, define an error budget
Error budget (Cntd..)


> Any time your service is down, the time required to recover it is consumed from the error budget


> After you define an error budget
> as long as you are within the error budget, you are good to go for more changes


> Once you run out of error budget, hold all future deployments & make the system stable first


> Larger error budget
> means more downtime is acceptable for the service,
> frequent changes are possible.


> Smaller error budget
> means less downtime is acceptable for the service,
> fewer changes are allowed.


> Error budgets make sure that smaller & gradual changes are deployed.
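
A small worked example may make the budget concrete. This is a minimal sketch, assuming the budget is measured as downtime minutes over a 30-day window; the window length and both function names are assumptions for illustration, not something the slides define.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0 <= slo_percent <= 100:
        raise ValueError("slo_percent must be between 0 and 100")
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


def can_deploy(slo_percent: float, downtime_minutes: float,
               window_days: int = 30) -> bool:
    """True while error budget remains; False means hold further changes."""
    return downtime_minutes < error_budget_minutes(slo_percent, window_days)


if __name__ == "__main__":
    # A 97% SLO over 30 days leaves 3% of 43,200 minutes as budget.
    print(error_budget_minutes(97))      # minutes of downtime allowed
    print(can_deploy(97, 1000))          # budget left, changes may continue
    print(can_deploy(97, 1296))          # budget spent, stabilize first
```

A tighter SLO shrinks the result of `error_budget_minutes`, which is why a smaller error budget allows fewer changes.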
SLI, SLO & SLA


SLO


> Service level objective

> It is an internal objective of the team

> An SLO is something everyone in the org wants to achieve
> The error budget is directly related to the SLO

> It is the complement of the error budget

> Error budget of 3% means the service is down 3% of the time at most

> SLO of 97% means the service should be up 97% of the time

> Error Budget + SLO = 100%


> Define SLOs with respect to latency, availability, response time
SLI


> Service level indicator

> An indicator internal to the team

> The SLI needs to be compared against the SLO

> SLIs are metrics tracked over time (generally at 5-minute intervals)


> An SLI ranges from 0 to 100%


SLI = (Total Good Events / Total Valid Events) x 100


> Let's say the SLO is 96%
> 96% of requests should be served within 300 ms latency.


> If the current SLI is 95%, or anything less than 96%, the system is underperforming.


> SLIs help us find which services are not performing as per the SLO


> A good SLI leads to happy customers
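
The SLI formula and the 300 ms example above can be sketched in a few lines. The function names and the sample latencies are hypothetical; only the formula and the 96% / 300 ms figures come from the slide.

```python
from typing import Iterable


def sli_percent(good_events: int, valid_events: int) -> float:
    """SLI = good events / valid events x 100."""
    if valid_events <= 0:
        raise ValueError("valid_events must be positive")
    return good_events * 100 / valid_events


def latency_sli(latencies_ms: Iterable[float], threshold_ms: float = 300) -> float:
    """Share of requests served within the latency threshold, in percent."""
    samples = list(latencies_ms)
    good = sum(1 for latency in samples if latency <= threshold_ms)
    return sli_percent(good, len(samples))


def meets_slo(sli: float, slo: float) -> bool:
    """The service is on target when the SLI is at or above the SLO."""
    return sli >= slo


if __name__ == "__main__":
    sli = latency_sli([120, 250, 310, 90])   # made-up sample latencies
    print(sli, meets_slo(sli, slo=96))
```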


SLI (Cntd..)


> A good SLI leads to happy customers
> If changes in an SLI do not impact customers, the SLI definition is not worthwhile


> Different signals to track
> Latency
> Traffic
> Errors
> Saturation
> Availability of the system


> Selecting the right SLOs & SLIs will lead to success
SLA


> Service level agreement

> It is a contract with consequences for failing to meet the SLOs it contains
> SLOs & SLAs are quite similar

> But your SLAs should not be the same as your SLOs


> An SLO is an internal objective
> If you cannot meet the SLO, the team can slow down changes


> SLA violations are shared with your customers
> If you cannot meet the SLA, compensation needs to be provided to customers


> https://cloud.google.com/terms/sla


> When the SLI is higher than the SLO & SLA, the current indicator shows services are performing as expected
SLA (Cntd..)


> If the SLI goes below the SLO, slow down

> If the SLI goes below the SLA, notify customers & compensate

> A higher SLA is good, but you are more likely to violate it

> A lower SLA means you will meet it, but customers will have less confidence in your services


> Google's recommendation in the case of a very high SLA
> Take your service down for some time
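
The two "goes below" rules above amount to a small decision function. This is a sketch only: `action_for` is a hypothetical name, and it assumes the SLA is set looser (lower) than the internal SLO, which is the usual arrangement but is not spelled out on the slide.

```python
def action_for(sli: float, slo: float, sla: float) -> str:
    """Pick the response for the current SLI, given SLO and SLA targets."""
    if sla >= slo:
        # Assumption: the customer-facing SLA is looser than the internal SLO.
        raise ValueError("expected the SLA to be lower than the SLO")
    if sli < sla:
        return "notify customers and compensate"
    if sli < slo:
        return "slow down changes"
    return "performing as expected"


if __name__ == "__main__":
    for current in (99.95, 99.7, 99.0):
        print(current, "->", action_for(current, slo=99.9, sla=99.5))
```

Keeping the SLO stricter than the SLA gives the team room to react before customers are owed anything.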
Docker
Docker


> Docker is a software development platform

> You package your app in images

> Containers use an image to start the application

> Containers run on any operating system

> It works exactly the same regardless of OS, machine, or environment
> Lightweight compared to a VM

> Easier to maintain & deploy


> Docker works with any language, runtime, OS
Docker vs VM


[Figure: virtual machines compared with containerized applications running on a host operating system]
Docker workflow


[Figure: Docker installation; Dockerfile, Docker image, Docker container; container registry (e.g. AWS ECR)]
Create Simple webapp


> Python-based web application
> main.py
> Dockerfile


> Build Docker images


> Push to Container Registry
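
For reference, a `main.py` for such a web app could look roughly like the sketch below. It is not the course's actual file: it uses only the standard library (`wsgiref`) so that it runs without extra packages, and the `PORT` variable, the default of 8080, and the `serve` argument are assumptions chosen for illustration.

```python
import os
import sys
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Minimal WSGI application that answers every request with a greeting."""
    body = b"Hello from a container!\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def main() -> None:
    # Assumption: the platform passes the listening port in PORT.
    port = int(os.environ.get("PORT", "8080"))
    with make_server("0.0.0.0", port, app) as server:
        server.serve_forever()


if __name__ == "__main__" and sys.argv[1:] == ["serve"]:
    main()
```

Under these assumptions, `python main.py serve` starts the server, and a Dockerfile would use the same command as its start command.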
Deploy App


Two things to consider


> Where you want to deploy


> What the deployment strategy is


Compute Options


> Compute Engine
> Kubernetes Engine
> App Engine
> Cloud Run
> Cloud Functions


Deployment methods


> Blue/green Deployment
> Rolling Deployment
> Canary Deployment


> Traffic splitting Deployment
Blue/green Deployment


[Figure: blue/green deployment]
Rolling Deployment


[Figure: rolling deployment]
Canary Deployment


[Figure: canary deployment, instances moving to v2.0]
Traffic Splitting


> A small percentage of users is served the new version (e.g. 10-20%)
> If everything is fine, redirect all users to the new version.


> Traffic splitting can be used for A/B testing.
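
One way to picture traffic splitting is a router that assigns each user to a version by percentage. The sketch below is an illustration under assumptions, not how any particular GCP service implements it: `pick_version` is a hypothetical helper, and hashing the user ID is just one choice that keeps a given user on the same version across requests.

```python
import hashlib


def pick_version(user_id: str, new_version_percent: int) -> str:
    """Send roughly new_version_percent of users to "v2", the rest to "v1"."""
    if not 0 <= new_version_percent <= 100:
        raise ValueError("new_version_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket 0..99
    return "v2" if bucket < new_version_percent else "v1"


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    on_new = sum(pick_version(u, 10) == "v2" for u in users)
    print(f"{on_new} of {len(users)} users see the new version")
```

Raising the percentage step by step and finally setting it to 100 corresponds to redirecting all users to the new version.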
Deploy Cloud Functions
Deploy to App Engine
Deploy to Cloud Run
Deploy to Kubernetes


Deploy to Compute Engine 


> IaaS - Infrastructure as a Service
> General-purpose computing machine


> 2 ways of deployment
> Containerized app
> Via Container-Optimized OS
> Via another OS + manual Docker installation
> Non-containerized app
> Manually install Apache


> Install via startup script
Deploy to Instance


> Instance template
> Blueprint for all virtual machines


> Create instance from template


> Instance group
> managed
> unmanaged


> Load balancer
CI/CD Pipeline


Continuous integration


[Figure: continuous integration pipeline]
Continuous Deployment


[Figure: continuous deployment pipeline]
Continuous Deployment vs Continuous Delivery


> Continuous Deployment
> Fully automated, no manual intervention


> Code is continuously built & deployed


> Continuous Delivery
> Release to Production
> May involve manual approval
> It makes sure deliveries are frequent & fast
> Before Continuous Delivery, the release frequency was usually once every 3 months
> Now it is possible to release 5 times a day
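
The difference between the two comes down to whether a manual approval gate sits in front of production. A minimal sketch of that idea follows; `run_pipeline` and the stage functions are hypothetical and stand in for real build and test steps.

```python
from typing import Callable, List


def run_pipeline(stages: List[Callable[[], bool]],
                 manual_approval: bool,
                 approved: bool = False) -> str:
    """Run stages in order, then release, optionally behind an approval gate."""
    for stage in stages:
        if not stage():
            return "failed"
    if manual_approval and not approved:
        return "awaiting approval"   # continuous delivery: a person decides
    return "released"                # continuous deployment: fully automated


def build() -> bool:
    return True   # placeholder for a real build step


def test() -> bool:
    return True   # placeholder for a real test step


if __name__ == "__main__":
    print(run_pipeline([build, test], manual_approval=False))
    print(run_pipeline([build, test], manual_approval=True))
    print(run_pipeline([build, test], manual_approval=True, approved=True))
```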
Different Services for CI/CD
Source Code management


> Bitbucket
> Mercurial
> GitHub
> Cloud Source Repository
Build


> Jenkins
> CircleCI
> Cloud Build
Artifact Storage


> JFrog Artifactory
> Container Registry
> Artifact Registry
Deployment


> Compute Engine
> Kubernetes Engine
> App Engine
> Cloud Run
> Cloud Functions
CI/CD Pipeline - 1


Create Docker Image & Push to Container Registry


Source Code (repo-1)
> Dockerfile
> main.py


Cloud Build (cicd-1)
> Build the image
> Push the image to the registry (image1)


CI/CD Pipeline - 2


Deploy Python Web app to Google App Engine


Source Code (repo-2)
> app.yaml
> main.py
> requirements.txt


Cloud Build (cicd-2, cloudbuild.yaml)
> Deploy to App Engine
CI/CD Pipeline - 3


Deploy to Google Cloud Function


Source Code (repo-3)
> main.py
> requirements.txt
> function-source.zip


Cloud Build (cicd-3, cloudbuild.yaml)
> Deploy to Cloud Function
CI/CD Pipeline - 4


Deploy to Google Cloud Run


Source Code (repo-4)
> app.py
> Dockerfile
> requirements.txt


Cloud Build (cicd-4, cloudbuild.yaml)
> 1. Docker build
> 2. Docker push
> 3. gcloud run deploy


Deploy to Cloud Run
CI/CD Pipeline - 5


Deploy to Google Kubernetes Engine


Source Code (repo-5)
> app.py
> Dockerfile


Cloud Build (cicd-5, cloudbuild.yaml)
> 1. Git clone
> 2. Docker build
> 3. Docker push
> 4. kubectl set image


Deploy to GKE
GKE CI/CD Steps


> Create a GKE cluster

> Deploy some sample Docker images

> Create a load-balancer-based service

> Create source code repos

> Add Python code, Dockerfile, cloudbuild.yaml


> cloudbuild.yaml
> Clone repo
> Build image
> Push image
> Update to the new image


> Create a Cloud Build trigger
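
The build, push, and update steps can be sketched as a Cloud Build config generated from Python and written out as JSON, which Cloud Build accepts alongside YAML. Everything passed to `build_config` below (image path, deployment, container, zone, and cluster names) is a hypothetical placeholder. The clone step is left out on the assumption that a trigger-started build already has the repository checked out.

```python
import json


def build_config(image: str, deployment: str, container: str,
                 zone: str, cluster: str) -> dict:
    """Return a Cloud Build config with build, push, and update steps."""
    return {
        "steps": [
            # Build the image from the Dockerfile in the repo root.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["build", "-t", image, "."]},
            # Push the image to the registry.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["push", image]},
            # Point the existing GKE deployment at the new image.
            {"name": "gcr.io/cloud-builders/kubectl",
             "args": ["set", "image", f"deployment/{deployment}",
                      f"{container}={image}"],
             "env": [f"CLOUDSDK_COMPUTE_ZONE={zone}",
                     f"CLOUDSDK_CONTAINER_CLUSTER={cluster}"]},
        ]
    }


if __name__ == "__main__":
    config = build_config(
        image="gcr.io/$PROJECT_ID/hello-app:$SHORT_SHA",  # placeholder names
        deployment="hello-app",
        container="hello-app",
        zone="us-central1-a",
        cluster="demo-cluster",
    )
    print(json.dumps(config, indent=2))
```

`$PROJECT_ID` and `$SHORT_SHA` are left as literal strings so that Cloud Build can substitute them when a trigger runs the build.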
Jenkins


> Popular open source tool for CI/CD
> Alternative to Cloud Build


> Jenkins can be extended with plugins
> Source code - Git
> Unit testing - JUnit
> gcloud SDK


> Installation
> VM + manual Jenkins installation
> Marketplace solution (preferred)
IAC - Infrastructure as code
IAC


> Infrastructure as code

> The process of managing and provisioning cloud resources with a descriptive language

> You could create a Shell/Python script for creating a VM

> But writing and maintaining such code is a tedious task


> A better language is needed to create resources


Create network
Wait for the above step to finish
Provision subnet


Create firewall rule
Wait for the above step to finish
Compute Engine instance with all parameters


resource "google_compute_instance" "first-instance" {
  name         = "hello-1"
  zone         = "us-central1-a"
  machine_type = "n1-standard-1"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-9"
    }
  }

  network_interface {
    network = "default"
  }
}
Tools for IAC


> Cloud-native tools are available for infrastructure provisioning
> Azure - Template

> Google - Deployment Manager

> AWS - CloudFormation

> JSON/YAML


> Terraform is cloud agnostic.


> With multiple providers, resources can be provisioned across multiple clouds.
Terraform


> Terraform is one of the most popular tools for infrastructure provisioning
> Free - Open source

> Developed by HashiCorp

> Quick & easy to get started with a single binary file


> Master HCL - Terraform in a short span of time
> Terraform has multiple providers available.
> Apart from public clouds, lots of other providers are available for network, DNS, firewall, database


> Write configuration in HCL/JSON.
> HCL is preferred.


> Terraform is an agentless tool


> It is not a configuration management tool. It works well with Ansible.
© ANKIT MISTRY - TERRAFORM


Terraform Installation
Terraform - Create VM
Secure Container Deployment
Secure Container Deployment


> Google managed base images
> Container analysis


> Binary authorization 
Secure Base Image


[Figure: a container image built up from Layer 1 to Layer 4]


> The base image is built on an Ubuntu/Debian-based OS


> Choosing the right base image is important


> So, how do you pick the right base image?


> Solution: Google Marketplace


> Google maintains these images & uses them for its own app deployments.
Container Scanning


> Container Analysis provides
> automated vulnerability scanning


> manual vulnerability scanning
> For containers in Artifact Registry and Container Registry
> Works exactly the same for both registries


> Manual
> gcloud artifacts docker images scan imageurl --remote


> Automated
> Let's see it in action
Scanning & Base image Demo
Binary authorization


> How to prevent untrusted images from being deployed

> Binary authorization

> Binary Authorization is a deploy-time security control

> It ensures only trusted container images are deployed on Google Kubernetes Engine (GKE) or Cloud Run


> Let's see it in action
> For Cloud Run
> For GKE
Cloud Operation Tool
Operation Tool


> Operations like monitoring, logging

> Why logging & monitoring are required

> What is logging

> Kinds of logs - Audit logs

> Log collection

> Log routing

> Log export

> Cloud Monitoring - Metrics, dashboards, uptime checks, alerts


> Cloud Debugger, Trace, Profiler, Error Reporting


Why such tools


> Software development + maintenance

> Everyone wants their software to run smoothly
> But no software is bug free

> Issues come up at the Dev, Test, or Prod stage
> How to find the root cause behind them


> You need to continuously monitor resources
> Is there sufficient disk space?
> Is the application slow?
> Is CPU usage going beyond 90%?
> Who did what in Prod (even if by mistake)?


> So, to answer all those questions & many more, such tools are required


Cloud Monitoring 


> Monitor various cloud resources
> Different Metrics can be measured 
> Monitor one or more GCP Project or AWS Account 


> Workspace 
> Multiple metrics can be added 


> Default workspace & custom workspace 


> Let's see in action — Monitoring UI 
Cloud Logging


> Log management tool

> Fully managed service

> Stores exabyte-scale data

> Logs can be collected from multiple sources
> Search & analyze logs


> Let's explore the Logging UI
> Logs Explorer, Dashboard, Log Metrics, Logs Router
Types of Cloud Audit Logging


Who did what, when, where


Admin Activity
> Administrative actions
> Example: Create VM, Delete VM
> Enabled by default
> 400 days retention
> Free
> Cannot be configured, cannot be disabled


Data Access
> Create, modify resource data
> Example: Create object in bucket
> Not enabled by default
> 30 days retention
> Not free
> Can be disabled


System Event
> Generated by Google systems
> Example: VM migration, preemptible VM
> Enabled by default
> 400 days retention
> Free
> Cannot be configured, cannot be disabled


Policy Denied
> Google service denies access
> Example: Security violation
> Enabled by default
> 30 days retention
> Not free
> Cannot be disabled, but can be excluded with filters


[Hands-on] 
Cloud audit Logging 
Explore Audit Log Structure
Log Collection


> Read/write logs via the gcloud SDK


> Automatically
> Cloud Run, GKE, App Engine


> Logging agent
> For Compute Engine on Google Cloud / AWS VMs
> Legacy agent / Ops Agent

> Cloud Logging API


> Python/Java SDK
> From on-premises
[Hands-on] Log based metrics
Log Router


Logs arrive at the Log Router from various sources
From the router, they are diverted to various sinks


Two types of log bucket
_Required
_Default


Logs can be routed to a user-defined bucket


Sinks
BigQuery
Cloud Storage
Pub/Sub
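
Conceptually, the router checks each incoming entry against every sink's filter and copies it to all destinations that match. The sketch below models only that idea in plain Python: the `LogRouter` class, the dictionary-shaped entries, and the lambda filters are assumptions for illustration and are not the Cloud Logging API.

```python
from typing import Callable, Dict, List, Tuple

LogEntry = Dict[str, str]


class LogRouter:
    """Toy router: each sink has a destination and an inclusion filter."""

    def __init__(self) -> None:
        self._sinks: List[Tuple[str, Callable[[LogEntry], bool]]] = []

    def add_sink(self, destination: str,
                 include: Callable[[LogEntry], bool]) -> None:
        self._sinks.append((destination, include))

    def route(self, entry: LogEntry) -> List[str]:
        # Every sink is evaluated independently, so one entry can land
        # in several destinations, or in none.
        return [dest for dest, include in self._sinks if include(entry)]


if __name__ == "__main__":
    router = LogRouter()
    router.add_sink("_Default bucket", lambda e: True)
    router.add_sink("BigQuery", lambda e: e.get("severity") == "ERROR")
    router.add_sink("Pub/Sub", lambda e: e.get("resource") == "gce_instance")
    print(router.route({"severity": "ERROR", "resource": "gce_instance"}))
    print(router.route({"severity": "INFO", "resource": "cloud_run"}))
```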


[Figure: Ingesting and routing logs with the Log Router; storing, viewing, and managing logs in Cloud Logging storage; using logs in the Google Cloud ecosystem. Logs data arrives through the Cloud Logging API. The _Required log sink routes to the _Required log bucket (400-day retention, non-configurable). The _Default log sink routes to the _Default log bucket (30-day retention, configurable). User-defined log sinks route to user-defined log buckets (30-day retention, configurable). Other components shown: Cloud Storage, BigQuery, Pub/Sub, log-based metrics, Cloud Monitoring.]
More Ops/Dev Tools


> Cloud Error Reporting - detects errors
> Cloud Debugger - inspects the state of a running application
> Cloud Trace - latency


> Cloud Profiler - how many resources are consumed
Optimize resource utilization


> Resource cost, utilization levels, billing
> Preemptible VMs
> Committed use discounts [CUD], sustained use discounts [SUD]


> TCO considerations
Preemptible VM


> Just like any other virtual machine
> Short-lived, cheaper virtual machine


> Provision a preemptible VM when
> The workload is fault tolerant
> 100% availability is not required
> Cost is critical


> Up to 80% discount
> Max life is 24 hours
> Not always available


> Google gives you a 30-second warning before automatic shutdown
> Regular VMs have higher priority than preemptible VMs


> Let's see how to configure it
Flat-rate,
committed use discounts [CUD],
sustained use discounts [SUD]
Flat Rate


> Pay for what you use
> No special discount


> In Compute Engine:
> E2 and A2 categories of machine
Sustained use discounts [SUD]


> Sustained use discounts are automatic discounts for running specific Compute Engine resources a significant portion of the billing month


> Applies to N1, N2 machine types
> Not applicable to other machine types


> Applies if you use the resource for at least 25% of the month


> Only on GKE & VM instances


> Let's see it in action


[Figure: effective discount versus % of month used]
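
How the effective discount grows with usage can be sketched with a tiered calculation. The rates below are the incremental rates Google has documented for N1 machine types (each successive quarter of the month billed at 100%, 80%, 60%, and 40% of the base rate). Treat them as an assumption and check the current pricing documentation, since rates differ by machine family and can change.

```python
# (share of the month covered by the tier, fraction of base rate charged)
N1_TIERS = [(0.25, 1.0), (0.25, 0.8), (0.25, 0.6), (0.25, 0.4)]


def effective_discount(fraction_of_month: float) -> float:
    """Effective discount (0..1) for running a VM this fraction of the month."""
    if not 0 < fraction_of_month <= 1:
        raise ValueError("fraction_of_month must be in (0, 1]")
    remaining = fraction_of_month
    billed = 0.0
    for width, rate in N1_TIERS:
        used = min(remaining, width)   # usage that falls inside this tier
        billed += used * rate
        remaining -= used
    return 1 - billed / fraction_of_month


if __name__ == "__main__":
    for share in (0.25, 0.5, 0.75, 1.0):
        print(f"{share:.0%} of month -> {effective_discount(share):.1%} discount")
```

Under these assumed rates, nothing is discounted for the first 25% of the month, and a VM that runs the whole month ends up about 30% cheaper than the base price.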
Committed use discounts [CUD]


> Let's say your workload is predictable
> You can commit for 1 year or 3 years
> Get a discount of up to 70%.

> Only on GKE & VM instances

> Commitments cannot be cancelled


> Let's see it in action
Total cost of ownership (TCO)


> TCO = Purchase cost of assets + Cost of operation


> When moving to the cloud from on-premises
> Costs need to be considered
> In GCP, there is no purchase of assets
> Provision resources with no minimum commitment (except for a few service features)
> Costs include (pay-as-you-go model)
> Operation cost
THANK YOU


